Abstract



The architecture of the eukaryotic genome is characterized by a high degree 
of spatial organization. Chromosomes occupy preferred territories correlated 
to their state of activity and, yet, displace their genes to interact with re- 
mote sites in complex patterns requiring the orchestration of a huge number 
of DNA loci and molecular regulators. Far from random, this organization 
serves crucial functional purposes, but its governing principles remain elu- 
sive. By computer simulations of a Statistical Mechanics model, we show 
how architectural patterns spontaneously arise from the physical interaction 
between soluble binding molecules and chromosomes via collective thermo- 
dynamics mechanisms. Chromosomes colocalize, loops and territories form 
and find their relative positions as stable thermodynamic states. These 
are selected by "thermodynamic switches" which are regulated by concen- 
trations/affinity of soluble mediators and by number/location of their at- 
tachment sites along chromosomes. Our "thermodynamic switch model" of 
nuclear architecture, thus, explains on quantitative grounds how well known 
cell strategies of upregulation of DNA binding proteins or modification of 
chromatin structure can dynamically shape the organization of the nucleus. 
Key words: chromatin organization; statistical mechanics; computer simu- 
lations; thermodynamics 
Introduction

Within the cell nucleus, genome structure has a complex organization in 
space spanning different scales. Chromosomes tend to form a set of distinct 
territories and, at a smaller level, are folded in higher-order structures, while 
a variety of physical intra- and inter-chromosomal interactions between spe- 
cific DNA sequences has been reported |l|, 2, 0, 0, @, 0). While structures 
can be formed by tethering specific DNA segments to scaffolding elements, 
such as the nuclear envelope, DNA-DNA contacts and chromatin loops are 
a ubiquitous organizational feature extending up to hundreds of kilobases,
and relocating, for instance, genes to substantial distances outside of their 
territory. Intriguingly, relative positions of territories, as well as of DNA 
sequences within a territory, have a probabilistic nature dynamically chang- 
ing with cell type and cell cycle phase. Yet, stable, non-random patterns 
are established, fundamental to genome regulation, as disruptions relate to 
serious diseases, most notably, cancer Q, H, 13, IE, M). Remarkably common
features are shared in chromatin organization processes, but the underlying 
principles of their control in space and time are still largely mysterious (0). 

While there is evidence that far apart DNA sequences, even on different 
chromosomes, can come together by interacting with molecular factors, the 
mechanisms whereby they do so and higher-order structures and territories 
arise are still largely mysterious. One of the scenarios proposed to explain 
the establishment of contacts between DNA elements is the so called 'ran- 
dom collision' picture (see, e.g., (0)) whereby chromatin flexibility allows 
factors bound to one sequence to randomly contact factors bound to sur- 
rounding chromatin. Although active mechanisms of directed motion have 
been described (see, e.g., (8)), diffusion-based mobility is indeed a prevailing
mechanism that delivers molecular complexes to their specific nuclear targets
(see (0) and refs. therein). So, loops could be formed when a diffusing
factor succeeds in bridging two chromosomal sites as a result of a "random 
double encounter" , whereby the molecule by chance encounters its first bind- 
ing site and then, by chance, the second one. Yet how such loops persist 
beyond the initial 'random collision' is totally unclear {?], fiol ) and many 
questions remain open: how strong are the bonds required to hold in place 
whole chromosomal segments? How are stochastic encounters coordinated 
in space and time for a functional purpose by the cell? Can higher-order 
structures and territories spontaneously arise from them? Here, by use of a 
polymer physics model we propose a scenario to answer such questions. 

Sequence-specific DNA-binding molecular factors have emerged as criti- 
cal regulators of chromatin interactions in the nucleus |l|, 0, 0, |j, d, @) and 
some of them are encountered in a variety of cases, as for instance SATB1
(11), Ikaros (12), PcG (13), and CTCF Zn-finger proteins, the latter known
to also mediate interchromosomal contacts (14, 15, 16, 17). In some cases
a combination of factors is required to induce looping, as in the example of
the erythroid transcription factor GATA-1 and its cofactors at the β-globin
locus (18, 19). Analogously, GATA-3 and STAT6 cooperation has been proposed
to establish long-range chromatin interactions at the TH2 cytokine
locus (20). Transcription factories themselves, i.e., local high concentrations
of Pol II, have been proposed to act as hubs in the formation of loops and
the colocalization of distant genes, even outside chromosome territories (see
(5, 21, 22) and refs. therein). In the last few years, protein-DNA interactions
that occur in vivo have been probed by innovative genome-wide techniques 
leading to the description of thousands of binding sites for DNA binding 
proteins (23), and systematic approaches to measuring their binding energy
landscapes are being developed (24). DNA binding proteins typically exhibit
a number of target loci, which can be found clustered in groups. Their
DNA chemical affinities are in general found in the weak biochemical energy
range, $E_X \sim 1 \div 15\,kT$ (12, 23, 25) ($k$ is the Boltzmann constant
and $T$ room temperature). Although in most cases only qualitative information
is available, details on binding energies and DNA locations have been
clearly described for a number of examples (see (24, 25, 26, 27, 28) and refs.
therein). Initial works on bacteria have shown that DNA binding proteins 
can have hundreds of DNA sites with affinities in the range $2 \div 15\,kT$ ((0)
and ref.s therein). In yeast, more recently, the landscape of binding energies 
and loci has been explored by advanced computational biophysics methods: 
the distribution of their binding energies spans a range of about $10\,kT$, and
they can have hundreds of DNA binding sites across the genome as well 
(see ([2^) and refs. therein). Similar ranges in binding energies have been
found in higher eukaryotes, including mice and humans ((12, 25) and
refs. therein), where common examples exist of proteins with thousands of
DNA target sequences. 



DNA-DNA interactions mediated by molecular factors are being exten- 
sively mapped, revealing a complex network of intra and interchromosomal 
interactions (29). Clusters of binding sites of SATB1 (11, 30), and zinc finger
class proteins CTCF (31, 32), Ikaros (12) and GATA-1 (33) were found
in a number of regions involved in DNA cross talk. An important example
is the cluster of CTCF binding sites responsible for X chromosome pairing,
at the onset of X inactivation, located at the Xist/Tsix locus where, in a
sequence a few kb long, a group of about a hundred binding sites, each 20b long,
is found (31). Expansion of the nuclear volume leads to the disassembly
of several nuclear compartments (34), which might suggest that specific concentrations
of macromolecules are required for the self-assembly of nuclear
structures. Loss of specific interchromosomal DNA-DNA contacts has been
described after a marked reduction, for instance, of the amount of CTCF
(15, 17). Changes in the concentration of "heterochromatin" proteins, e.g.,
HP1 (35, 36), are also known to affect the organization of genomic DNA
(0).

The conformational properties of chromosomes have been investigated 
by using polymer models in the past (jH, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
48, 49). The chromatin fiber was modeled as a random walk in a confining
geometry (0), and the possibility was considered to include giant loops, 
of about 3 Mb, departing from its backbone to describe folding at different
scales (RWGL model (39,0)). The multi-loop-subcompartment (MLS)
model (41, 42) aimed to represent 'rosette' structures, with 120 kb loops, like
those experimentally observed. To describe the radial arrangement of chromosome
territories in human cell nuclei, a model was proposed (43) where
each chromosome is approximated by a linear chain of spherical 1 Mbp-sized 
chromatin domains. Adjacent domains are linked together by an entropic 
spring and by an effective excluded volume potential, while to maintain 
the compactness of chromosome territories a weak potential barrier around 
each chromosome chain was also included. Recently, the "Random Loop" 
polymer model ()0) has introduced the idea that a set of randomly located 
sites along a random walk chain can bind each other, in order to explain, 
at the same time, the experimentally observed presence of loops of different 
scales and the leveling-off of the mean square distance between two beads 
of the chain at genomic distances above 1-2 Mb. Several other chromatin
features have been successfully explored by computer simulations, including
nucleosome interactions (44), packing (45, 46), molecular assembly (48, 49),
providing a vivid description of the geometry and conformational properties 
of chromatin as observed in experiments. 

Here, by investigation of a polymer physics model inspired by the above 
biological scenario, we discuss how architectural patterns spontaneously 
arise from the interaction of soluble binding molecules and chromosomes. 
Our model shows that thermodynamics dictates pathways to complex pat- 
tern formation: loops, colocalization of distant sequences, chromosomal do- 
mains, structures and territories spontaneously organize as stable thermo- 
dynamic states when specific threshold values in molecule concentrations or 
their affinity to DNA sites are exceeded. By regulation of expression levels 
and modification of DNA targets, the cell can, thus, act on "thermodynamic 
switches" (j5ol . 0) to reliably control its genome organization in space and 
time.



Theoretical Model 

To describe a system made of a chromosome and its binding molecules, we 
consider an established model of polymer physics (52, 53): the chromosome
polymer is modelled as a Brownian self-avoiding walk (SAW) of $n$ non-overlapping
beads, and soluble molecules as Brownian particles having a
concentration, $c$ (see Fig. 1). A fraction, $f$, of polymer sites can bind the
diffusing molecules, with a chemical affinity $E_X$ in the weak biochemical
range (see Methods for details). Here, for the sake of simplicity, binding sites
are uniformly interspersed with non-binding regions along the chain. Each
molecular factor can simultaneously bind multiple sites on the polymer, a
feature that reflects the presence of multiple DNA binding domains in a
number of regulatory proteins (e.g., CTCF). Mediating molecules with only
one DNA binding site that are able to interact with each other could
also be considered; since a group of linked molecules can be represented, in
the model, as just one mediator, the picture is unchanged. The equilibrium
thermodynamic properties of such a system were determined by extensive
Monte Carlo simulations (53, 54).



Methods 

In our Monte Carlo computer simulations (53), molecules and polymers diffuse
in a cubic lattice of linear size $L$, whose spacing, $d_0$, sets the
space unit. For computational purposes, we mostly consider lattices of linear
size $L = 32$, though we tested our results up to $L = 128$. SAW polymer
beads have a diameter $d_0$, and each bead in a chain is on a nearest or
next-nearest neighboring site of its predecessor. Molecules (of size $d_0$) are
also subject to Brownian motion. When neighboring a binding site of a polymer,
molecules interact with it via an effective energy, $E_X$. Depending on the
studied case (see Results), up to six distinct sites (i.e., the nearest neighbors
in a cubic lattice) on the same chromosome, or alternatively two sites on
different chromosomes can be bound at the same time.
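
The lattice rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the simulations: the function names (`min_image`, `is_valid_chain`, `binding_sites`) are hypothetical, the bond rule is read as "squared lattice distance between consecutive beads equal to 1 or 2", and binding sites are assumed to recur with a fixed period along the chain.

```python
def min_image(a, b, L):
    """Component-wise displacement a - b on a periodic lattice of linear size L."""
    disp = []
    for x, y in zip(a, b):
        d = (x - y) % L
        if d > L // 2:
            d -= L
        disp.append(d)
    return tuple(disp)


def is_valid_chain(beads, L):
    """Check self-avoidance and the bond rule for a list of lattice sites."""
    wrapped = [tuple(c % L for c in bead) for bead in beads]
    if len(set(wrapped)) != len(wrapped):      # two beads share a site
        return False
    for prev, cur in zip(wrapped, wrapped[1:]):
        d2 = sum(d * d for d in min_image(cur, prev, L))
        if d2 not in (1, 2):                   # nearest or next-nearest neighbor
            return False
    return True


def binding_sites(n, period):
    """Indices of binding beads when every `period`-th bead binds (f = 1/period)."""
    return [i for i in range(n) if i % period == 0]
```

For instance, `binding_sites(64, 3)` marks roughly one bead in three as a binding site, which mimics a fraction of about 1/3 on a chain of 64 beads.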

Our schematic model is a coarse-grained description of a real polymer 
and, since for now we mostly focus on the description of a general conceptual
framework, beads only represent generic binding sites (they could be a
binding locus, the bases of specific binding sequences, etc.). In cases where 
detailed data on binding sequences and regulator chemistry is available, such 
information could be easily taken into account in the model to produce specific
quantitative predictions. The role of interactions with, e.g., the nuclear
membrane could be also included, but to make the message simpler, we 
decided not to discuss such an aspect here. 

To obtain thermodynamic equilibrium configurations, the Metropolis 
Monte Carlo method was applied. Chromosome polymers are initially equili- 
brated in a random self-avoiding configuration obtained, in absence of bind- 
ing molecules, by random displacements of single beads under the constraint 
that each bead in the chain is on a nearest or next-nearest neighboring site
of its predecessor. Then molecules are inserted at random empty positions 
in the lattice to attain a given concentration. In the ensuing Metropo- 
lis Monte Carlo procedure, a sequence of states is generated by a Markov 
process (53) whereby a new position for a particle/bead is stochastically
selected according to a specific transition matrix satisfying the 'principle of
detailed balance', which in turn guarantees the convergence in probability of
the sampled states to the Boltzmann thermodynamic equilibrium distribution.
The transition probability for a particle/bead to diffuse to a neighboring
empty site is proportional to the Arrhenius factor $r_0 \exp(-\Delta E/kT)$, where
$\Delta E$ is the energy barrier in the move, $k$ the Boltzmann constant and $T$ the
temperature (53). The lattice has periodic boundary conditions to reduce
boundary effects. 
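
A single diffusion attempt of this kind might be sketched as follows. The snippet is a minimal illustration, not the authors' routine: it assumes the standard Metropolis acceptance rule (accept with probability min(1, exp(-ΔE/kT))), which satisfies detailed balance, and the names `accept_move`, `try_hop` and `energy_change` are hypothetical. The energy change is supplied by the caller in units of kT.

```python
import math
import random


def accept_move(delta_e_over_kt, rng):
    """Metropolis rule: accept if the energy does not increase, otherwise
    accept with probability exp(-dE/kT)."""
    if delta_e_over_kt <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e_over_kt)


def try_hop(pos, occupied, L, energy_change, rng):
    """Attempt to move one particle to a random nearest-neighbor site.

    `occupied` is the set of all occupied sites; `energy_change(old, new)`
    is a user-supplied function returning dE/kT for the proposed move.
    """
    axis = rng.randrange(3)
    step = rng.choice((-1, 1))
    new = list(pos)
    new[axis] = (new[axis] + step) % L          # periodic boundary conditions
    new = tuple(new)
    if new in occupied:                         # excluded volume: site is taken
        return pos
    if accept_move(energy_change(pos, new), rng):
        occupied.remove(pos)
        occupied.add(new)
        return new
    return pos
```

Passing an explicit `random.Random` instance keeps runs reproducible, which helps when averaging over many independent initial configurations.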

In a Monte Carlo lattice sweep every particle and bead in the system, 
randomly selected, is updated on average once. Our simulations run for up 
to $10^9$ Monte Carlo lattice sweeps, as the number of decorrelation steps from
an initial configuration can be as large as $10^5$. The achievement of stationarity
was monitored by checking the dynamics of different quantities, such as
the system gyration radius, the distance between two polymers, the system 
energy and the number of particles attached to polymers. Once equilibrium 
is reached for all these quantities, thermodynamic averages are calculated 
by considering only configurations having a distance larger than the decor- 
relation length. Finally, averages are also performed over up to 2048 runs 
from different initial configurations. Confidence intervals are calculated as 
squared deviations around these averages, as discussed in (jHsl); they are 
indicated in our figures by the size of the used symbols. 
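
The averaging over decorrelated configurations can be illustrated with a short sketch. It is an assumption-laden example rather than the procedure of the paper: the function name `decorrelated_average` is hypothetical, the equilibration time `n_equil` and decorrelation length `tau` are taken as known inputs, and the uncertainty is estimated here as a plain standard error of the mean.

```python
import math


def decorrelated_average(series, n_equil, tau):
    """Mean and standard error from samples taken every `tau` steps
    after discarding the first `n_equil` steps."""
    samples = series[n_equil::tau]
    m = len(samples)
    mean = sum(samples) / m
    if m < 2:
        return mean, float("nan")
    var = sum((x - mean) ** 2 for x in samples) / (m - 1)
    return mean, math.sqrt(var / m)
```

The same function could be applied to any monitored quantity, for example the gyration radius or the number of bound particles recorded once per lattice sweep.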

Our code has two core routines, well described in Binder and Heermann 
(53): the "lattice gas" spin-exchange Metropolis routine for particle displacement,
and the Self Avoiding Walk routine. Several measures were taken to
avoid algorithmic errors, such as those suggested in (53). Each routine
in the code was tested independently. For example, the routine generating
the evolution of the Self Avoiding Walk chain was tested by checking the
behavior of the calculated average gyration radius, $R_g$, against the chain
length, $n$, and the power law $R_g \sim n^{\nu}$ with an exponent $\nu \simeq 0.6$, well
established in the literature (52, 53), was recovered. Another internal test
was to show that other geometric quantities, such as the chain end-to-end
distance, scale in the same way as $R_g$.
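
A scaling check of this type amounts to a straight-line fit in log-log coordinates. The following sketch assumes that average gyration radii have already been measured for several chain lengths; the function name `fit_power_law` is hypothetical and the data in the usage note are synthetic, not simulation results.

```python
import math


def fit_power_law(ns, rs):
    """Least-squares fit of log(r) = log(a) + nu * log(n); returns (nu, a)."""
    xs = [math.log(n) for n in ns]
    ys = [math.log(r) for r in rs]
    m = len(xs)
    x_mean = sum(xs) / m
    y_mean = sum(ys) / m
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    nu = sxy / sxx                      # slope = scaling exponent
    a = math.exp(y_mean - nu * x_mean)  # prefactor
    return nu, a
```

Feeding it synthetic values generated as `0.5 * n ** 0.6` for `n` in `(16, 32, 64, 128)` returns an exponent of 0.6 up to rounding, which is the behaviour expected for an open self-avoiding chain.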

Real chromosomes differ in size (i.e., n) and arrangement of their binding 
sites. Such differences affect their specific behaviors, but the general picture 
we aim to depict here is not altered by changes to the selected values of these 
parameters (e.g., n and L). To make computation time feasible, we mostly 
use $n = 64$, but we tried $n$ as large as 128. The robustness of our model
is well established in polymer physics (52, 53), and to check the effects of
finite size scaling we explored changes of the polymer chain length in the
range $n \in \{16, 32, 64, 128\}$ (see Results).



Results 



Intrachromosomal interactions, loop and territory formation 

We first discuss how a chromosome can fold up in loops within a territory 
with a specific spatial conformation by interacting with soluble molecules, 
and how the process can be controlled by the cell (see Fig. 1). The folding
state of the polymer is illustrated by its squared radius of gyration, $R_g^2$
(52): $R_g^2 = \frac{1}{2 n^2 \mathcal{N}} \sum_{i,j=1}^{n} (\mathbf{r}_i - \mathbf{r}_j)^2$, where $\mathbf{r}_i$ is the position of bead
$i \in \{1, \ldots, n\}$, and $\mathcal{N}$ a normalization constant (here $\mathcal{N}$ equals the average
squared gyration radius of a randomly floating SAW chain of size $n$). $R_g$
represents the radius of a 'minimal' sphere enclosing the polymer: it attains
a maximum when the polymer is loose and randomly folded, and a minimum 
when loops enclose it in a compact lump. 
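
The squared gyration radius defined above is a double sum over bead pairs and can be computed directly. In this sketch the function name `squared_gyration_radius` and the argument `norm` (standing for the normalization constant) are hypothetical, and bead positions are assumed to be unwrapped coordinates, i.e., not folded back by the periodic boundary conditions.

```python
def squared_gyration_radius(positions, norm=1.0):
    """R_g^2 = 1 / (2 n^2 norm) * sum over all bead pairs (i, j) of |r_i - r_j|^2."""
    n = len(positions)
    total = 0.0
    for ri in positions:
        for rj in positions:
            total += sum((a - b) ** 2 for a, b in zip(ri, rj))
    return total / (2.0 * n * n * norm)
```

With `norm=1.0` the result equals the mean squared distance of the beads from their centre of mass; setting `norm` to the value measured for a free SAW chain of the same length reproduces the normalized quantity used in the text.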

In presence of a given concentration of molecules, loops could be created 
by chance when a particle bridges a couple of chromosomal sites having 
a non zero affinity, $E_X$. Fig. 2, left panel, shows, indeed, that $R_g^2$ attains
a small plateau value when $E_X$ is large enough (say above the inflection
point, $E_{tr}$, of the curve $R_g^2(E_X)$): bridges are thermodynamically favored
and the polymer takes a compact looped territorial conformation, as seen in
a typical 'snapshot' from computer simulations depicted in Fig. 1, right panel.
The system behavior, however, switches for $E_X < E_{tr}$, since $R_g^2$ keeps its
maximal value corresponding to a fully open polymer floating in space (see
Fig. 2, left panel) and no stable loops are formed. The folding level also
depends on factors such as concentration of molecules, number and location 
of DNA binding sites (see below). 
The above results have an intuitive basis: if $E_X$ is small, the half-life of
a randomly formed bridge is short and polymer segments on average float
away; the higher $E_X$, the higher the number of bound molecules and, thus,
of bridges, which reinforce each other and stabilize the conformation, as
multiple bonds must be simultaneously broken to release a loop. Our
physics model reveals, in particular, that a precise threshold marks the
switch between the two regimes; $E_{tr}$ corresponds to a thermodynamic phase
transition (53), as discussed later on. This picture illustrates on quantitative
grounds how chromatin modifications, such as DNA methylation or
post-translational modifications of DNA binding proteins (well described cell 
strategies to change genomic architecture), can result in dramatic, switch- 
like, effects. 

In a different thermodynamic pathway to loop formation, the cell can 
regulate the concentration, $c$, of binding molecules. The plot of $R_g^2(c)$ (Fig. 2,
central panel) shows how $c$ affects the compaction state of the polymer.
When $c$ is below the threshold, $c_{tr}$, $R_g^2(c)$ has a value corresponding to
random folding, while above $c_{tr}$, it decreases towards its "looped state"
value. A broad crossover region is found around $c_{tr}$, revealing that $R_g$,
which can be envisioned in our example as the radius of the "territory",
can be tuned across a range of values. So, the regulation of a DNA binding
protein concentration (a typical event in cellular behavior) can act as another
switch for the reliable assembly of genomic architectures.

Finally, we find (Fig. 2, right panel) that a minimal threshold in the
number of polymer binding sites (or in their fraction $f$) is required for stable
looping/territory formation. Conceptually, the case of a polymer with
a low number of binding sites is equivalent to the case of a polymer with
many binding sites in the presence of a limiting concentration of mediators.
The function $R_g^2(f)$ indicates that a "thermodynamic switch" to DNA
compaction resides in the potential to obliterate/restore a fraction of sites
via chromatin modifications that abolish binding of the relevant regulatory
molecule. Intriguingly, the presence of a thermodynamic threshold in
$f$ could relate to the experimental observation that multiple binding sites
for mediators have been found at chromosome interaction loci and looping
points (e.g., CTCF mediated interactions). Importantly, in our model
we find that the threshold value, $f_{tr}$, is a strongly decreasing function of
the binding energy, $E_X$. This can be expected as, for an above threshold
mediator concentration, $c$, the overall binding energy linking two polymer
strands is approximately $f E_X$; so an increase of $E_X$ would correspond to an
inversely proportional reduction of $f_{tr}$.
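
The scaling argument of the last sentence can be written out explicitly. The constant $E^{*}$ below is a hypothetical symbol introduced only for illustration (it does not appear in the model definition); it stands for the total binding energy per pair of strands at which looping sets in, for a mediator concentration above threshold.

```latex
% Assumption: looping sets in when the overall binding energy linking two
% strands, approximately f E_X, reaches a fixed value E^* (hypothetical constant).
f_{tr}\, E_X \simeq E^{*}
\quad\Longrightarrow\quad
f_{tr}(E_X) \simeq \frac{E^{*}}{E_X} \propto \frac{1}{E_X}
```

Under this assumption, doubling the binding energy halves the fraction of binding sites needed for stable looping.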

The above described "thermodynamic switches" define a robust regulatory
mechanism, as seen in the phase diagram of Fig. 4 reporting the equilibrium
state of the chromosome (open vs looped) in a wide range of $E_X$ and
$c$ values (for a given $f$). In particular, Fig. 4 shows that the threshold value
$E_{tr}(c)$ (dashed line) required for loop assembly decreases as $c$ increases and
can be as weak as a hydrogen bond. In the cell, the possibility to drive
looping by use of sites with even low binding energy for their soluble ligands 
could be important to prevent polymers from getting stuck in topologically 
unacceptable entanglements or ectopic associations, since each single low 
energy bond can be easily broken for adjustments. 

The threshold values in the $(c, E_X, f)$ space (see Fig. 4), related in polymer
physics to the chain $\theta$-point (52), correspond to a phase transition
occurring in the system when one of two competing thermodynamic mechanisms
prevails: entropy, $S$, which favors loose random folding, or energy,
$E$, which increases when bonds between molecules and DNA sites are established
by loop formation. The system spontaneously tends (as it is finite
sized (55)) to select the state where its Free Energy, $F(c, E_X, f) = E - TS$,
is minimized. More precisely, the chromosome conformation has a specific
stochastic distribution (having a width which can be very narrow) following
from Boltzmann thermodynamic weights (55).



Scaling behaviour of the model 

As molecule binding regions on 'cross-talk' loci of real chromosomes have 
variable sizes, n, we explored the 'scaling behaviour' of our system by varying 
the polymer chain length in the range $n \in \{16, 32, 64, 128\}$, for the above
value of the containing box size $L$. The reference case considered previously,
and in the rest of the paper, has $n = 64$, which is comparable to values
included in similar studies (41, 42, 43).



We investigate, in particular, how the average gyration radius, $R_g$, and
the threshold energy, $E_{tr}$, depend on $n$. For clarity, we refer to
the case discussed in the left panel of Fig. 2, but similar features are found
for the other cases presented in our paper. We, thus, consider a system with
$c = 0.04\%$ and $f = 1/3$, and discuss first the case where $E_X = 1\,kT$, i.e.,
the phase where the polymer is "open" (see Fig. 2, left panel). Under these
circumstances, as shown in the lower panel of Fig. 3, $R_g$ scales with $n$ as a
power law, $R_g \sim n^{\nu}$, with an exponent $\nu \simeq 0.6$ which is in agreement with
the random SAW scaling laws (52, 53). Conversely, for $E_X = 4\,kT$, i.e., in
the "looped" phase, $R_g$ scales as $n^{1/3}$ (see lower panel of Fig. 3), showing
that the polymer is lumped in a compact conformation (1/3 is the inverse of
the Euclidean dimension of the system). The threshold energy $E_{tr}$ has also
a comparatively simple behavior with $n$ and appears to saturate to a finite
value for large $n$. For instance, the threshold energy defined in the left panel
of Fig. 2 (where $E_{tr}(64) \simeq 3\,kT$) can be well fitted by a power law in $n$ (see
upper panel of Fig. 3): $E_{tr}(n) = E_{tr}^{\infty} + A/n^{B}$, where $E_{tr}(n)$ is the value for a
chain of size $n$, $E_{tr}^{\infty}$ the fitting value for an infinitely long chain, $A$ and $B$ a
fitting coefficient and exponent (we find $E_{tr}^{\infty} \simeq 0.96\,E_{tr}(64)$, $A \simeq 0.47\,E_{tr}(64)$
and $B \simeq 0.5$).
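
To give a feeling for the size of this finite-size correction, the fitted form can be evaluated for the chain lengths considered. The sketch below simply plugs in the fitted coefficients quoted above, with the threshold for n = 64 taken as 3 kT; the function name `threshold_energy` and its keyword arguments are hypothetical, and the printed numbers are values of the fitting formula, not new simulation data.

```python
def threshold_energy(n, e_tr_64=3.0, e_inf_ratio=0.96, a_ratio=0.47, b=0.5):
    """E_tr(n) = E_inf + A / n**B, in units of kT.

    E_inf and A are expressed as fractions of the n = 64 threshold,
    following the fitted values quoted in the text.
    """
    e_inf = e_inf_ratio * e_tr_64
    a = a_ratio * e_tr_64
    return e_inf + a / n ** b


for n in (16, 32, 64, 128):
    print(n, round(threshold_energy(n), 3))
```

At n = 64 the formula returns a value about 2% above the quoted threshold, which is consistent with the rounding of the fitted coefficients, and the correction term shrinks as the chain gets longer.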

Similar properties are found for the other quantities discussed in this 
paper. These checks outline the robustness of the picture discussed above 
and also support the idea that it is not an artifact of discretization, as a 
system in the continuum limit, i.e., on a finely divided lattice, should have 
an analogous behavior. 

Interchromosomal segment interactions 

The mechanisms that drive other layers of spatial organization, including 
the colocalization of DNA sequences belonging to different chromosomes 
(jHlJ) and the relative positioning of chromosomal territories (0, aaaaa, 
can be shown to be very similar to those inducing stable loop formation 
within a single chromosome. Concentration/affinity acts in these cases as 
a "thermodynamic switch" for segment colocalization and for chromosome 
positioning in a map. 

To such an aim, in an extension of the model described above, we now 
investigate the thermodynamic state of two SAW chains (representing either
two distal sequences on the same chromosome or sequences on distinct
chromosomes) with a fraction $f$ of binding sites (periodically placed)
for a concentration, $c$, of molecules having an affinity, $E_X$, to both of
them (see Fig. 5); for simplicity, each molecule can bind each polymer once.
The relative polymer positioning is given by their squared distance:
$d^2 = \frac{1}{2 n^2 \mathcal{D}} \sum_{i,j=1}^{n} (\mathbf{r}_i^{(1)} - \mathbf{r}_j^{(2)})^2$, where $\mathbf{r}_i^{(1)}$ (resp. $\mathbf{r}_i^{(2)}$) is the position of
bead $i$ in chromosome 1 (resp. 2), and $\mathcal{D}$ a normalization constant (here
$\mathcal{D}$ is equal to the average square distance of two independent random SAW
chains). The average value of $d^2$ is maximal when polymers float independently
(i.e., $d^2 = 1$ in our normalization) and decreases drastically when all
or parts of the chains become colocalized. 
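
The inter-polymer squared distance has the same structure as the gyration radius, with the double sum now running over beads of two different chains. The sketch below is illustrative only: the function name `squared_chain_distance` and the argument `norm` (the normalization constant) are hypothetical, both chains are assumed to have the same number of beads, and positions are assumed to be expressed in a common unwrapped frame.

```python
def squared_chain_distance(chain1, chain2, norm=1.0):
    """d^2 = 1 / (2 n^2 norm) * sum over i, j of |r_i^(1) - r_j^(2)|^2
    for two chains of n beads each."""
    n = len(chain1)
    if len(chain2) != n:
        raise ValueError("both chains must have the same number of beads")
    total = 0.0
    for ri in chain1:
        for rj in chain2:
            total += sum((a - b) ** 2 for a, b in zip(ri, rj))
    return total / (2.0 * n * n * norm)
```

Choosing `norm` as the value measured for two independent free chains gives a normalized distance equal to 1 in the random phase, so that colocalization shows up as a drop below 1.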

Regulation of $E_X$ can induce formation or release of stable physical contact
between the polymers. Fig. 6 shows that when $E_X$ is below a threshold,
$E_{tr}$, their equilibrium distance, $d^2$, has the same value found for two non-interacting
Brownian SAW chains (i.e., $d^2 = 1$). This is the 'random phase'
where chromosomes move independently. By thermodynamic mechanisms
an effective attraction between the polymers is, instead, established when
$E_X > E_{tr}$: physical contact is stable and $d^2$ drastically decreases, as the
system enters the 'colocalization phase'. The equilibrium distance is a function
of $c$ as well (see Fig. 6): when $c$ is below a threshold value, $c_{tr}$, a random
distance is found between chromosomes (i.e., $d^2 = 1$). Colocalization is spontaneously
attained, instead, when $c$ increases, as $d^2(c)$ approaches a plateau
with a much smaller value. Finally, for a given $c$ and $E_X$, colocalization can
be achieved only if the number of binding sites along the polymers is above
a sharp threshold value, as shown in Fig. 6, where $d^2(f)$ is plotted.

Like the loop architecture within a chromosome territory, the average distance
of chromosome pairs can be controlled via thermodynamic mechanisms.
The spatial association is attained when a phase transition line is
crossed, corresponding to the point where entropy loss due to chain pairing
is compensated by energy gain as both polymers are bound: the lower $E_X$,
the higher the concentration, $c$, required.

Assembly of chromosome territorial maps 

Within the above picture, the relative positioning of chromosomal loci and 
territories can be understood by similar arguments. As an example, we 
considered (see inset in Fig. 7) the case with three SAW chains ($n = 64$),
each having a fraction, $f$, of binding sites ($f = 1/2$, $E_X = 4\,kT$): the
sites on polymer 1 and 2 interact with a molecular factor (concentration
$c_{12}$) which can bind each chain once; polymer 2 and 3 bind a different
molecular factor of concentration $c_{23}$ (for definiteness, we only discuss the
case where $c_{12} = c_{23} = c$). In order to illustrate the important effects
of physical interference between chromosomes, in this model all molecular
factors compete for the same sites on polymer 2. Because of the built-in symmetry,
polymer couples 1-2 and 2-3 behave similarly and have, on average, equal
relative distances $d_{12}^2 = d_{23}^2$ as a function of $c$ (see Fig. 7). Yet, since polymer
1 and 3 physically interfere when bridging with 2, in a competition for its
binding sites, their distance is larger than the one found in the case with
only two polymers under similar conditions (i.e., same $c$, $E_X$, $f$ and system
size). The distance between 1 and 3, $d_{13}^2$, is in turn larger than $d_{12}^2 = d_{23}^2$
because there is no direct interaction. The three 'chromosomes', thus,
spontaneously find their position to form an (isosceles) 'triangle' having sides
of predefined length ($d_{12}$, $d_{23}$, $d_{13}$).

Different patterns of relative positions can be attained by tuning the 
concentration/affinity switches, as the system architecture self-organizes via
thermodynamic pathways, funneling the interaction between sets of DNA
binding sites and matching molecular mediators. When the number and
length of chromosomal segments increase, the dynamics of the system to
equilibrium can be slowed down by physical hindrance. This raises the speculation
that the spatial organization of chromosomes in distinct territories
and within territories (along with other mechanisms, e.g., the action of topoi- 
somerases) may also serve the purpose of a faster and better control of their 
interaction and function, by reducing undesired entanglements. 

Discussion 

Within the cell nucleus, in a striking example of self-organization, an aston- 
ishing number and diversity of DNA loci and molecular mediators are spa- 
tially orchestrated to form a complex and functional architecture involving 
regulatory cross talk between distant sites. We propose a simple conceptual 
framework, a "thermodynamic switch model" of nuclear architecture, to un- 
derstand some of its general features, namely Q): 1) how a chromosome 
can fold up into a territory and how its looping is dynamically controlled by 
binding molecules; 2) how chromosomes interact and establish their relative 
positioning; 3) what are the regulatory principles and 4) the origin of the 
stochastic character of territorial maps. 

Our model consists of a system of Self-Avoiding Walk polymers inter- 
acting with soluble molecular mediators. By use of Statistical Mechanics, 
we have shown that thermodynamics dictates pathways to complex pattern 
formation, via mechanisms such as "thermodynamic switches" (see Fig. 4).
This supports, on quantitative grounds, the idea that a variety of intra- 
and inter-chromosome interactions can be traced back to similar mecha- 
nisms. Looping and compaction, remote sequence interactions and territo- 
rial segregated configurations correspond to thermodynamic states selected 
by appropriate values of concentrations / affinity of soluble mediators and by 
number and location of their attachment sites along chromosomes. After 
proper concentrations/affinities are set, the organization proceeds sponta- 
neously with no energetic costs as the resources required, e.g., to rearrange 
even whole chromosomes, are provided by the surrounding thermal bath. 
Our picture thus explains how well-described cell strategies of upregulation
of DNA-binding proteins or modification of chromatin structure can
shape the genomic architecture and produce DNA colocalization and
territories according to thermodynamically driven, non-random patterns.
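
As an illustration of the basic observable of the model, the following is a
minimal sketch, not the simulation code used in this work, of how the squared
gyration radius of a short Self-Avoiding Walk on a cubic lattice could be
measured. The function names (grow_saw, gyration_radius_sq), the simple
sampling scheme and the chain length n = 16 are assumptions made for
illustration only; the soluble binding molecules are not included.

```python
import random

# The six unit steps of a simple cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def grow_saw(n_beads, rng, max_tries=100_000):
    """Return a self-avoiding walk of n_beads sites by simple sampling.

    A random walk is grown step by step and discarded as soon as it
    revisits a site, so this is only practical for short chains.
    """
    for _ in range(max_tries):
        walk = [(0, 0, 0)]
        occupied = {(0, 0, 0)}
        while len(walk) < n_beads:
            x, y, z = walk[-1]
            dx, dy, dz = rng.choice(STEPS)
            site = (x + dx, y + dy, z + dz)
            if site in occupied:
                break  # self-intersection: reject this attempt
            walk.append(site)
            occupied.add(site)
        else:
            return walk  # the chain reached full length without overlaps
    raise RuntimeError("no self-avoiding walk found; try a shorter chain")


def gyration_radius_sq(walk):
    """Mean squared distance of the beads from their centre of mass."""
    n = len(walk)
    centre = [sum(bead[i] for bead in walk) / n for i in range(3)]
    return sum(
        sum((bead[i] - centre[i]) ** 2 for i in range(3)) for bead in walk
    ) / n


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is reproducible
    samples = [gyration_radius_sq(grow_saw(16, rng)) for _ in range(200)]
    print("mean R_g^2 for n = 16:", sum(samples) / len(samples))
```

Simple sampling is chosen here only because it is short and unbiased; its
acceptance rate falls off quickly with chain length, so chains of n = 64 beads
would require a different sampling method.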

Testable quantitative predictions are made on the biological effects of
alterations of genomic DNA sequences (such as deletions, insertions, chemical
changes, etc.) and of their molecular mediators (concentrations, binding
energies, etc.). In particular, the model highlights the fact that, at
above-threshold concentrations, the interaction with low-affinity molecular
factors may be sufficient to drive the compaction of chromosomes into
territories, and shows that the interaction of chromosomes with soluble
mediators has the potential to impart a probabilistic relative arrangement to
chromosomes. Our analysis reveals that molecular factors that act as bridges
between two chromosomes may not only pull them close to each other, but may
also displace non-interacting chromosomes, so that these are farther away
from each other than the "random" distance. This result is thought-provoking
in the light of experimental data (56) showing that disruption of
transcription can lead either to an increase or to a decrease of chromosome
intermingling between specific pairs of chromosomes, depending on which pair
is considered. Allele-specific, parent-of-origin-specific, and
expression-specific DNA-DNA interactions have also
been described (flR 0, 57, [H, 0). In this context, our analysis could
explain how imprinting and other allele-specific protein-DNA interactions
may have the capacity to direct homologous chromosomes to two different
regions of the territory map.

A rough estimate of threshold molecular concentrations in real nuclei can
be made from our predicted concentration values: here $c$ is the number of
molecules per lattice site, so the number of molecules per unit volume is $c/d_0^3$,
where $d_0$ is the linear lattice spacing constant. The molar concentration $\rho$
is obtained by dividing by the Avogadro number $N_A$, i.e., $\rho = c/(d_0^3 N_A)$. Note that threshold
concentrations depend on the binding energy $E_X$ (see, e.g., Fig. 4). For
the sake of definiteness, however, we can consider the case with $E_X \simeq 2kT$
(see Fig. 4), where threshold concentrations are around $c = 0.01\%$ to $0.1\%$.
Under the rough assumption that $d_0$ is a couple of orders of magnitude
smaller than the nucleus diameter (i.e., $d_0 \sim 10$ nm), a threshold molar
concentration would be $\rho \sim 0.1$ to $1$ $\mu$mol/litre, which is consistent with
typical experimental values of nuclear protein concentrations (60, 61). Such an
estimate is very rough, but may help to further bridge this study with
biological investigations.
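
The unit conversion behind this estimate can be checked with a short
calculation. The sketch below is only an illustration: the function name
molar_concentration is hypothetical, and the lattice spacing of 10 nm is the
rough assumption made in the text, not a measured value.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molar_concentration(c, d0_m):
    """Convert c (molecules per lattice site) to mol/litre.

    d0_m is the lattice spacing in metres; one site occupies d0_m**3.
    """
    site_volume_litre = (d0_m ** 3) * 1e3  # 1 m^3 = 1000 litres
    return c / (site_volume_litre * AVOGADRO)


d0 = 10e-9  # assumed lattice spacing: 10 nm
for c_percent in (0.01, 0.1):
    rho = molar_concentration(c_percent / 100.0, d0)
    print(f"c = {c_percent}% -> rho = {rho * 1e6:.2f} micromol/litre")
```

With these inputs the two threshold values correspond to roughly 0.17 and
1.7 micromol/litre, in line with the order of magnitude quoted above. Since the
result scales as the inverse cube of the lattice spacing, a factor of two in
that spacing changes the estimate by almost an order of magnitude.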

Starting from experimental results showing that chromatin fiber at large 
genomic distances, above 1-2 Mb, exhibits a leveling-off of the mean square
distance between two DNA sites, a Gaussian "Random loop" polymer model 
was recently proposed (0). To explain these observations, the model intro- 
duced the idea of long range interactions along the polymer, where a given 
number P of couples of distant beads, randomly selected along the chain, 
are bound by a harmonic potential of amplitude $\kappa$. The model investigated
the mean distance between sites and the size of loops, and showed that the
presence of random loops on all length scales explains the leveling-off of the
mean square distance. That model is closely similar to the present
work, where cross interactions of a fraction, $f$, of DNA sites are mediated by
the binding of molecular factors and by the formation of bridges of energy
$E_X$. In our case the number of interacting site couples also depends on the
concentration of mediators, $c$. Interestingly, the case mainly investigated in
(47) has $\kappa/kT = 1$, which is in the energy range we consider, although our
site interaction is short ranged, while in (47) it is a harmonic potential.
Nonetheless, the number of interacting sites in our model would correspond,
in the notation of ref. (47), to a $P$ which is a (non-trivial) function of
$c$. These considerations can illustrate the agreement between the discovery
in (47) of the leveling-off of the mean square distance and our finding, for
instance, that for $c$ above threshold, the polymer gyration radius does not
attain the (self-avoiding) random walk value but saturates to much smaller
values.

In real cells, passive and active regulatory mechanisms can cooperate, 
adding further layers of complexity, while the list of
molecules mediating chromatin organization is likely to include dedicated
structural proteins, RNAs and, e.g., the transcription, replication, or repair
machinery (21, 62, 63). In our picture, specificity of interactions is obtained
by specific molecular mediators binding to specific loci, while other general 
molecules could help the process. In the arrangement of specific binding sites 
along chromosomes and scaffolding elements, a variety of spatial patterns 
can be encoded (0) on an evolutionary time scale. Within a cell, patterns 
could be then dynamically selected by the combinatorial use of a set of me- 
diators via the ineluctable, yet probabilistic, laws of thermodynamics (0). 

Work supported by grant MIUR-FIRB RBNE01S29H, Network MRTN-CT- 
2003-504712. The authors declare no competing interests. 
Figure Legends

Figure 1

The figure shows two representative snapshots from our 3D computer simulations.
In the left panel a Self-Avoiding Walk (SAW) polymer is shown, as
it floats randomly within the assigned volume without forming stable loops.
In the right panel the volume also contains a concentration $c = 0.04\%$
of Brownian molecules (yellow) having an affinity $E_X = 4kT$ for a fraction
$f = 1/3$ of the polymer beads (shown in a darker shade). As molecules can
bind more than one polymer site, loops can be formed. However, they are
stable, and confine the polymer in a closed territory (as in the case shown
here), only if $c$ is above a threshold value (see Fig. 2). The SAW chains
shown here comprise $n = 64$ beads.

Figure 2

The equilibrium average gyration radius, $R_g^2$, of the model polymer pictured
in Fig. 1 depends on the affinity, $E_X$, of its binding sites for a set of molecular
factors, on the concentration, $c$, of those factors, and on the fraction,
$f$, of polymer beads which can bind molecules. $R_g$ represents the radius of
a sphere enclosing the polymer: it has a maximum ($R_g^2 = 1$ in our normalization)
when folding is random and a minimum when the polymer loops on
itself in a lump (the horizontal red line is the radius of a compact sphere
formed by the polymer). In the left panel, $R_g$ is shown as a function of
$E_X$, for a given value of $c$ and $f$ (here $c = 0.04\%$, $f = 1/3$). For $E_X$ below
a threshold value, $E_{tr} \simeq 3kT$, $R_g^2$ is approximately 1 and the polymer is on
average open. For $E_X > E_{tr}$, $R_g^2$ collapses, as the polymer forms a looped
territory. In the central panel, $R_g^2$ is shown as a function of $c$, for a given
$E_X$ and $f$ (here $E_X = 4kT$, $f = 1/3$). Also in this case a threshold effect is
observed ($c_{tr} \simeq 0.01\%$), although a broader crossover region exists where the
level of folding can be tuned. The right panel shows the sharp threshold
of $R_g$ as a function of $f$ ($f_{tr} \simeq 0.1$, here $c = 0.04\%$, $E_X = 4kT$), illustrating
that only in the presence of multiple sites (i.e., above $f_{tr}$) can the polymer be
folded in loops. In all the above cases, loops are thermodynamically stable
only above the threshold values, as a consequence of a phase transition
occurring in the system. By tuning affinities/concentrations, the cell can thus act
on a "thermodynamic switch" to form and release loops and territories.
Figure 3

Lower panel: The average gyration radius, $R_g$, for the polymer model
considered in the left panel of Fig. 2 is plotted as a function of the polymer
chain length $n$. The picture shows the ratio $R_g(n)/R_g(64)$ (since $n = 64$ is
the reference case dealt with in the rest of the paper) for $n = 16, 32, 64, 128$.
In the phase where the polymer is "open", i.e., for $E_X = 1kT < E_{tr}$ (see left
panel of Fig. 2), the average gyration radius, $R_g$ (filled circles), scales with
$n$ as a power law $R_g \sim n^{\nu}$ with an exponent $\nu \simeq 0.6$ (52, 53) (superimposed
fit, dashed line). In the "looped" phase, i.e., for $E_X = 4kT > E_{tr}$, $R_g$
(empty circles) scales as $n^{1/3}$ (superimposed fit, long dashed line), showing
that the polymer is lumped in a compact conformation. Upper panel:
The threshold energy, $E_{tr}$, relative to the left panel of Fig. 2 is a function of
the polymer chain length $n$. Here we plot the ratio $E_{tr}(n)/E_{tr}(64)$ (where
$E_{tr}(64) \simeq 3kT$). The superimposed fit is: $E_{tr}(n) = E_{\infty} + A/n^{B}$, where
$E_{tr}(n)$ is the threshold energy for a chain of size $n$, $E_{\infty} \simeq 0.96\,E_{tr}(64)$
the extrapolated value for an infinitely long system, and $A \simeq 0.47\,E_{tr}(64)$ and
$B \simeq 0.5$ a fitting coefficient and exponent.
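
The two scaling forms quoted in this legend can be evaluated directly. The
sketch below is illustrative only: the function names are hypothetical, the
coefficients are the rounded values given above, and energies are expressed in
units of the threshold energy at n = 64.

```python
def threshold_energy_ratio(n, e_inf=0.96, a=0.47, b=0.5):
    """E_tr(n) / E_tr(64) from the fit E_tr(n) = E_inf + A / n**B."""
    return e_inf + a / n ** b


def gyration_ratio(n, nu, n_ref=64):
    """R_g(n) / R_g(n_ref) for a power law R_g ~ n**nu."""
    return (n / n_ref) ** nu


for n in (16, 32, 64, 128):
    print(
        n,
        round(threshold_energy_ratio(n), 3),  # finite-size threshold energy
        round(gyration_ratio(n, 0.6), 3),     # open phase, nu ~ 0.6
        round(gyration_ratio(n, 1 / 3), 3),   # looped phase, nu = 1/3
    )
```

With the rounded coefficients the fitted ratio at n = 64 comes out close to
1.02 rather than exactly 1, which presumably reflects the rounding of the
quoted values.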



Figure 4

The state of the polymer/chromosome (see Fig. 1) at thermodynamic equilibrium
is summarized by this phase diagram in a range of values of 'weak'
biochemical affinities, $E_X$, and concentration, $c$, of its binding molecules
(here $f = 1/3$). When $E_X$ and $c$ are below the transition line, $E_{tr}(c)$ (empty
circles), the polymer is 'open' (as sketched in the inset) and no stable loops
can be formed. Above threshold, instead, the system enters the region where
the polymer is folded and 'looped' on itself.



Figure 5

Two snapshots are shown from computer simulations of our two-polymer
model. In the left panel the polymers float independently within the assigned
volume. In the right panel the volume also includes a concentration,
$c = 0.3\%$, of molecules (yellow particles), each of which can simultaneously bind
each polymer once at any of its specific loci (darker sites, here a fraction
$f = 1/2$ with affinity $E_X = 4kT$). When $c$ is above a threshold value
(see Fig. 6), as in the case shown, thermodynamically stable bridges can be
formed between the polymers, which spontaneously tend to pair parts of or
all their chains.
Figure 6

The equilibrium average distance, $d^2$, of the two-polymer model pictured
in Fig. 5 is a function of the affinity, $E_X$, of their binding sites for diffusing
molecules, of the concentration, $c$, of molecules, and of the fraction, $f$,
of polymer binding sites. In the left panel, $d^2$ is plotted as a function
of $E_X$ (here $c = 0.3\%$, $f = 1/2$). When $E_X$ is smaller than a threshold,
$E_{tr} \simeq 3.5kT$, $d^2$ is maximal ($d^2 = 1$ in our normalization) and the polymers
float independently of one another. For $E_X > E_{tr}$, $d^2$ drastically
decreases, as the polymers are spontaneously colocalized. In the central
panel, $d^2$ is shown as a function of $c$ (here $E_X = 4kT$, $f = 1/2$) and a
threshold appears as well ($c_{tr} \simeq 0.07\%$), surrounded by a crossover region.
In the right panel, the sharp threshold of $d^2$ as a function of $f$ is shown
($f_{tr} \simeq 0.4$, here $E_X = 4kT$, $c = 0.3\%$): only multiple binding sites, above
$f_{tr}$, can achieve polymer colocalization. The mechanism driving polymer
colocalization is an effective reciprocal attraction of thermodynamic origin,
related to a phase transition: below threshold, molecules bridging the polymers
by chance do not succeed in holding them in place; above threshold,
bridges are thermodynamically stabilized. Molecular mediators act, then,
as a "thermodynamic switch" for the spontaneous formation and release of stable
polymer contacts.

Figure 7

The relative positions of three polymers can be regulated by the concentration
of specific molecular factors. Inset: A configuration is shown from our
computer simulations of a three-polymer model. A specific molecular factor
can bind polymers 1 (pink) and 2 (blue), while a different factor binds polymers
2 and 3 (orange). Both molecular factors have here a concentration
$c = 0.13\%$ ($E_X = 4kT$, $f = 1/2$), but they are not shown for clarity. Main
panel: The average distance between polymers 1-2, $d_{12}^2$ (squares), decreases
as a function of $c$ (the distance between 2-3 equals $d_{12}^2$, and is not shown).
As an indirect effect of the attraction within pairs 1-2 and 2-3, the distance
between 1 and 3, $d_{13}^2$ (diamonds), decreases as well, remaining, though, above
$d_{12}^2$. The three polymers, thus, tend to form a triangle with two short equal
edges (corresponding to $d_{12}$ and $d_{23}$) and a longer edge (i.e., $d_{13}$). In general,
by tuning $c$, $E_X$ and $f$ a variety of configurational patterns can be
spontaneously attained. Notably, since polymers 1 and 3 compete for bridging
the sites of polymer 2, they physically interfere and $d_{12}^2$ is larger than in the
case of an isolated couple (yellow lower line, from Fig. 6). A proper spatial
organization of chromosomes in territories and within territories could also
help minimize physical interference and entanglement.

Figure 8

Schematic illustration of "thermodynamic switches" and their effects at different
levels of system organization. Top panel: The assembly of chromosome
loops is thermodynamically possible only when the concentration/affinity
of binders (circles) exceeds precise threshold values. At that point, previously
randomly and independently diffusing molecules and chromosomes
spontaneously generate an organized pattern, in a process reversible by
downregulation of the switch. Specific conformations can be attained by
site specificity of a set of molecular mediators. Bottom panel: Similar
threshold and self-organization mechanisms act to establish contact between
remote loci and, at a higher scale, relative positions of territories. A
variety of patterns, encoded in the location of a number of binding sites
along chromosomes, can be precisely selected via thermodynamic effects by
a combinatorial use of a set of molecular mediators (rectangles).